Journals
  Publication Years
  Keywords
Search within results Open Search
Please wait a minute...
For Selected: Toggle Thumbnails
Hadoop adaptive task scheduling algorithm based on computation capacity difference between node sets
ZHU Jie, LI Wenrui, WANG Jiangping, ZHAO Hong
Journal of Computer Applications    2016, 36 (4): 918-922.   DOI: 10.11772/j.issn.1001-9081.2016.04.0918
Abstract507)      PDF (783KB)(460)       Save
Aiming at the problems of the fixed task progress proportions and passive selection of slow tasks in the task speculation execution algorithm for heterogeneous cluster, an adaptive task scheduling algorithm based on the computation capacity difference between node sets was proposed. The computation capacity difference between node sets was quantified to schedule tasks by fast and slow node sets, and dynamic feedback of nodes and tasks speed were calculated to update slow node sets timely to improve the resource utilization rate and task parallelism. Within two node sets, task progress proportions were adjusted dynamically to improve the accuracy of slow tasks identification, and the fast node which executed backup tasks dynamically for slow tasks by substitute execution implementation was selected to improve the task execution efficiency. The experimental results showed that, compared with the Longest Approximate Time to End (LATE) algorithm, the proposed algorithm reduced the running time by 5.21%, 20.51% and 23.86% respectively in short job set, mixed-type job set and mixed-type job set with node performance degradation, and reduced the number of initiated backup tasks significantly. The proposed algorithm can make the task adapt to the node difference, and improves the overall job execution efficiency effectively with reducing slow backup tasks.
Reference | Related Articles | Metrics
Resource matching maximum set job scheduling algorithm under Hadoop
ZHU Jie, LI Wenrui, ZHAO Hong, LI Ying
Journal of Computer Applications    2015, 35 (12): 3383-3386.   DOI: 10.11772/j.issn.1001-9081.2015.12.3383
Abstract613)      PDF (725KB)(332)       Save
Concerning the problem that jobs of high proportion of resources execute inefficiently in job scheduling algorithms of the present hierarchical queues structure, the resource matching maximum set algorithm was proposed. The proposed algorithm analysed job characteristics, introduced the percentage of completion, waiting time, priority and rescheduling times as urgent value factors. Jobs with high proportion of resources or long waiting time were preferentially considered to improve jobs fairness. Under the condition of limited amount of available resources, the double queues was applied to preferentially select jobs with high urgent values, select the maximum job set from job sets with different proportion of resources in order to achieve scheduling balance. Compared with the Max-min fairness algorithm, it is shown that the proposed algorithm can decrease average waiting time and improve resource utilization. The experimental results show that by using the proposed algorithm, the running time of the same type job set which consisted of jobs of different proportion of resources is reduced by 18.73%, and the running time of jobs of high proportion of resources is reduced by 27.26%; the corresponding percentages of reduction of the running time of the mixed-type job set are 22.36% and 30.28%. The results indicate that the proposed algorithm can effectively reduce the waiting time of jobs of high proportion of resources and improve the overall jobs execution efficiency.
Reference | Related Articles | Metrics
Three-queue job scheduling algorithm based on Hadoop
ZHU Jie ZHAO Hong LI Wenrui
Journal of Computer Applications    2014, 34 (11): 3227-3230.   DOI: 10.11772/j.issn.1001-9081.2014.11.3227
Abstract184)      PDF (756KB)(524)       Save

Single queue job scheduling algorithm in homogeneous Hadoop cluster causes short jobs waiting and low utilization rate of resources; multi-queue scheduling algorithms solve problems of unfairness and low execution efficiency, but most of them need setting parameters manually, occupy resources each other and are more complex. In order to resolve these problems, a kind of three-queue scheduling algorithm was proposed. The algorithm used job classifications, dynamic priority adjustment, shared resource pool and job preemption to realize fairness, simplify the scheduling flow of normal jobs and improve concurrency. Comparison experiments with First In First Out (FIFO) algorithm were given under three kinds of situations, including that the percentage of short jobs is high, the percentages of all types of jobs are similar, and the general jobs are major with occasional long and short jobs. The proposed algorithm reduced the running time of jobs. The experimental results show that the execution efficiency increase of the proposed algorithm is not obvious when the major jobs are short ones; however, when the assignments of all types of jobs are balanced, the performance is remarkable. This is consistent with the algorithm design rules: prioritizing the short jobs, simplifying the scheduling flow of normal jobs and considering the long jobs, which improves the scheduling performance.

Reference | Related Articles | Metrics